CUDA: update build CTK version to 12.8 #13360

Open · wants to merge 3 commits into master from github-workflow/update-cuda-12.8

Conversation

thevishalagarwal
Contributor

@thevishalagarwal commented May 7, 2025

Update the CUDA Toolkit version from 12.4 to 12.8 to support compilation for real arch sm120 (Blackwell GPUs).

  • updated ggml-cuda/CMakeLists.txt to add sm120 to the list of compiled architectures for Blackwell GPUs
  • updated the CUDA Toolkit version from 12.4 to 12.8 for the Windows GitHub CI build
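For illustration, the CMakeLists.txt change presumably amounts to adding a 120 entry to the CUDA architecture list; the exact list in ggml-cuda/CMakeLists.txt may differ from this sketch. In CMake's CMAKE_CUDA_ARCHITECTURES syntax, a "-real" suffix embeds pre-compiled SASS for that arch only, while a bare entry embeds both SASS and PTX:

```cmake
# Sketch only: the actual architecture list in ggml-cuda/CMakeLists.txt
# may differ. Requires CMake >= 3.18 for the -real/-virtual suffixes.
if (NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
    # sm_120 (Blackwell) needs CUDA Toolkit 12.8 or newer; "120-real"
    # embeds pre-compiled SASS so RTX 50 cards skip JIT at load time.
    set(CMAKE_CUDA_ARCHITECTURES "50;61;70;75;80;120-real")
endif()
```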

@github-actions bot added the Nvidia GPU, devops, and ggml labels May 7, 2025
@thevishalagarwal force-pushed the github-workflow/update-cuda-12.8 branch from e1db936 to c54c98f on May 12, 2025
@thevishalagarwal
Contributor Author

@JohannesGaessler @slaren @ggerganov ping for review

@slaren
Member

slaren commented May 14, 2025

I am not sure that we need to add real arch 120 to the build. The criterion for selecting which real archs to include is what we expect to be the most commonly used GPUs, to improve load time in those cases, but at this point there are likely very few people with RTX 50 series GPUs: below 1% according to the Steam hardware survey.
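For background on the trade-off raised here (editorial summary, not from the thread): real archs embed pre-compiled SASS, while PTX-only archs are JIT-compiled by the driver on first load, which is the load-time cost being weighed. In CMake terms, the two options look roughly like:

```cmake
# Hypothetical comparison of the two approaches under discussion.
# Option A: ship sm_120 SASS (fast first load on RTX 50, larger binary)
set(CMAKE_CUDA_ARCHITECTURES "80-real;120-real")
# Option B: ship only PTX for a baseline arch; the driver JIT-compiles
# it for newer GPUs such as sm_120, costing time on first load
set(CMAKE_CUDA_ARCHITECTURES "80-real;80-virtual")
```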

@thad0ctor

I am not sure that we need to add real arch 120 to the build. The criterion for selecting which real archs to include is what we expect to be the most commonly used GPUs, to improve load time in those cases, but at this point there are likely very few people with RTX 50 series GPUs: below 1% according to the Steam hardware survey.

Why not? The percentage of people using Blackwell for AI, particularly 5090s and the RTX 6000, is probably disproportionate to the overall Steam hardware survey.

thad0ctor added a commit to thad0ctor/llama.cpp that referenced this pull request Jul 2, 2025
- Add comprehensive 22-week implementation roadmap for Blackwell (compute capability 12.0)
- Include detailed technical specifications with code examples
- Focus on Flash Attention optimizations using Thread Block Clusters
- Plan leverages enhanced L2 cache (126 MB) and HBM3/HBM3e memory
- Build foundation already complete via PR ggml-org#13360 (CUDA 12.8 + sm120)
- Target 20-40% Flash Attention improvement over Ada Lovelace

Phase 1: Foundation and architecture detection (accelerated - complete)
Phase 2: Thread Block Clusters implementation
Phase 3: Flash Attention Blackwell optimizations
Phase 4-7: Advanced features, validation, and integration